Population in Zürich

The Zürich Statistical Office collects data on the city and its residents. This data is published as Linked Data.

In this tutorial, we will show how to work with Linked Data, using the population dataset as the main example.
We will look at how to query, process, and visualize it.

1. Population in city districts
2. Population origin
3. Population distribution: age and time
4. Population distribution: age and origin
5. Population distribution: age and sex
6. Population and real estate prices
7. Causes of death

SPARQL endpoint

Population data is published as Linked Data. It can be accessed with SPARQL queries.
You can send queries using HTTP requests. The API endpoint is https://ld.stadt-zuerich.ch/query/.
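To make the HTTP side concrete, here is a minimal sketch that prepares a SPARQL query as a POST request using only the Python standard library. It assumes the endpoint accepts the form-encoded `query=` parameter and the JSON results format described in the SPARQL 1.1 Protocol; the helper name `build_sparql_request` is made up for this illustration. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://ld.stadt-zuerich.ch/query/"


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare (but do not send) a SPARQL query as an HTTP POST request."""
    # Assumption: the endpoint follows the SPARQL 1.1 Protocol and accepts
    # the query as a URL-encoded form field named "query".
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` and decoding the response as JSON would run the query; a client library hides these details.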

Let's use SparqlClient from graphly to communicate with the database. Graphly will allow us to send SPARQL queries, add prefixes to them automatically, and work with the results as dataframes.

SPARQL queries can become very long. To improve readability, we will work with prefixes.

Using the add_prefixes method, we define persistent prefixes. Every time you send a query, graphly will automatically add the prefixes for you.
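Conceptually, adding prefixes means prepending `PREFIX` declarations to the query text. The sketch below illustrates that idea in plain Python. It is not graphly's implementation, and the helper `with_prefixes` and the example query are hypothetical; the two namespaces shown (schema.org and cube.link) are only assumed to be relevant for this dataset.

```python
def with_prefixes(prefixes: dict[str, str], query: str) -> str:
    """Prepend PREFIX declarations to a SPARQL query string."""
    header = "\n".join(
        f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()
    )
    return f"{header}\n{query}" if header else query


# Illustrative prefixes; the ones used in the tutorial may differ.
prefixes = {
    "schema": "http://schema.org/",
    "cube": "https://cube.link/",
}
query = "SELECT ?obs WHERE { ?obs a cube:Observation } LIMIT 3"
full_query = with_prefixes(prefixes, query)
print(full_query)
```

With the prefixes stored once, every query body can use short names such as `cube:Observation` instead of full IRIs.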

Population in city districts

Let's find the number of inhabitants in different parts of the city. The population data is available in the BEW data cube.

The query for the number of inhabitants in different city districts over time is:

Let's visualize the number of inhabitants per district. To do this, we will aggregate the population counts per district.
The cleaned dataframe becomes:
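One possible aggregation, sketched here in plain Python, keeps the most recent count for each district. This is an assumption: a mean over years would be another reasonable choice. The rows, district names, and column names below are made up for illustration and are not real figures.

```python
def latest_per_district(rows):
    """Keep, for each district, the population from its most recent year."""
    latest = {}
    for row in rows:
        current = latest.get(row["district"])
        if current is None or row["year"] > current["year"]:
            latest[row["district"]] = row
    return {district: row["population"] for district, row in latest.items()}


# Toy rows shaped like a query result: one row per district and year.
rows = [
    {"district": "District A", "year": 2019, "population": 1000},
    {"district": "District A", "year": 2020, "population": 1100},
    {"district": "District B", "year": 2020, "population": 500},
    {"district": "District B", "year": 2018, "population": 450},
]
per_district = latest_per_district(rows)
print(per_district)
```

With pandas, the same step would typically be a sort by year followed by a group-by on the district column.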

Population origin

Let's find the number of foreign and Swiss inhabitants. The share of the Swiss and non-Swiss population is available in the ANT-GGH-HEL data cube. The population count is available in the BEW data cube.

The query for the number of inhabitants and the share of foreigners over time is:
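Once the total population and the share of foreigners are available for a given year, the two headcounts follow from simple arithmetic. The sketch below assumes the share is expressed in percent; the function name and the numbers are illustrative only.

```python
def split_by_origin(population: int, foreign_share_percent: float) -> tuple[int, int]:
    """Return (swiss, foreign) headcounts from a total and a foreign share."""
    # Assumption: the share is given in percent (0-100), not as a fraction.
    foreign = round(population * foreign_share_percent / 100)
    return population - foreign, foreign


# Toy values, not real data.
swiss, foreign = split_by_origin(1000, 25.0)
print(swiss, foreign)
```

Deriving the Swiss count as the remainder keeps the two parts consistent with the total even after rounding.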

Population distribution: age and time

Let's find the number of inhabitants in different age groups. The population count per age group is available in the BEW-ALT-HEL-SEX data cube.

The query for the number of inhabitants in various age buckets over time is:

Let's calculate the population share for each age group. The dataframe becomes:
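The share calculation divides each age group's count by the total population of the same year. Here is a minimal plain-Python sketch of that step. The column names, age buckets, and numbers are assumptions made for illustration.

```python
from collections import defaultdict


def add_shares(rows):
    """Add a 'share' field: the group's count divided by its year's total."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["year"]] += row["population"]
    return [
        {**row, "share": row["population"] / totals[row["year"]]}
        for row in rows
    ]


# Toy rows, not real data.
rows = [
    {"year": 2020, "age_group": "0-19", "population": 200},
    {"year": 2020, "age_group": "20-64", "population": 600},
    {"year": 2020, "age_group": "65+", "population": 200},
    {"year": 2021, "age_group": "0-19", "population": 100},
    {"year": 2021, "age_group": "20-64", "population": 300},
]
with_shares = add_shares(rows)
for row in with_shares:
    print(row["year"], row["age_group"], row["share"])
```

Normalizing within each year makes age structures comparable over time even when the total population changes.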

Population distribution: age and origin

Let's take a look at the age distribution among Swiss and foreign inhabitants. We can find this data in the BEW-ALT-HEL-SEX data cube.

The query for the number of inhabitants in various age buckets, with their origin, over time is:

Let's calculate the population share for each origin and age group. The dataframe becomes:

Population distribution: age and sex

Let's take a look at the age distribution for female and male inhabitants. We can find this data in the BEW-ALT-HEL-SEX data cube.

The query for the number of inhabitants in various age buckets, with their sex, over time is:

Let's create a dataframe where one row represents one observation. It will allow us to use violin plots for our dataframe.
The dataframe becomes:
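Violin plots estimate a distribution from individual observations, so aggregated counts have to be expanded into one row per person. The sketch below shows one way to do that in plain Python. The `scale` parameter is a hypothetical addition to keep the row count manageable for large populations, and the toy rows use a single numeric age rather than real age buckets.

```python
def expand_observations(rows, scale=1):
    """Turn aggregated counts into one row per (scaled) observation."""
    expanded = []
    for row in rows:
        # With scale=100, one expanded row stands for 100 inhabitants.
        copies = row["population"] // scale
        expanded.extend(
            {"age": row["age"], "sex": row["sex"]} for _ in range(copies)
        )
    return expanded


# Toy rows, not real data.
rows = [
    {"age": 30, "sex": "female", "population": 3},
    {"age": 30, "sex": "male", "population": 2},
    {"age": 70, "sex": "female", "population": 1},
]
observations = expand_observations(rows)
print(len(observations))
```

The expanded list can then be handed to a plotting library with age on one axis and sex as the grouping variable.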

Population and real estate prices

Let's compare real estate prices and the number of inhabitants over time. We will need to work with both the population and the real estate datasets. The population data is available in the BEW data cube. The real estate prices are in the QMP-EIG-HAA-OBJ-ZIM data cube.

The query for the number of inhabitants and housing prices over time is:
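If the two series are retrieved separately rather than in one query, they need to be aligned on the year before they can be compared. The following sketch joins two toy series on their shared years. The function name, the dictionary layout, and all values are assumptions for illustration.

```python
def join_by_year(population, prices):
    """Inner-join two {year: value} series on the years they share."""
    shared_years = sorted(population.keys() & prices.keys())
    return [
        {"year": year, "population": population[year], "price": prices[year]}
        for year in shared_years
    ]


# Toy series, not real data.
population = {2018: 1000, 2019: 1050, 2020: 1100}
prices = {2019: 10.0, 2020: 12.0, 2021: 13.0}
joined = join_by_year(population, prices)
for row in joined:
    print(row)
```

An inner join drops years that appear in only one series, which avoids plotting one variable against missing values of the other.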

Causes of death

The Statistical Office reports the number of deaths and their causes. Let's try to understand what the main causes of death in Zürich are. This data is available in the GES-SEX-TOU data cube.

The query for each cause of death and its broader category is:

Let's aggregate those results under more meaningful group names.
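Regrouping amounts to mapping each reported cause to a broader label and summing the counts per label. The sketch below shows the pattern in plain Python. The cause names, group names, and counts are invented for illustration and do not reflect the labels or figures in the dataset.

```python
from collections import Counter

# Hypothetical mapping from detailed causes to broader groups.
GROUPS = {
    "Lung cancer": "Cancer",
    "Breast cancer": "Cancer",
    "Heart attack": "Cardiovascular disease",
    "Stroke": "Cardiovascular disease",
}


def group_causes(rows, groups, default="Other"):
    """Sum death counts per broader group; unmapped causes go to `default`."""
    totals = Counter()
    for cause, deaths in rows:
        totals[groups.get(cause, default)] += deaths
    return dict(totals)


# Toy rows of (cause, deaths), not real data.
rows = [
    ("Lung cancer", 3),
    ("Breast cancer", 2),
    ("Heart attack", 4),
    ("Fall", 1),
]
grouped = group_causes(rows, GROUPS)
print(grouped)
```

Sending unmapped causes to a catch-all group keeps the totals complete even when the mapping does not cover every label.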